Factorial Design Expert System

ABSTRACT

An automated expert system that uses split-run and factorial design methods to determine which factors are most important in an experiment. The expert system is architected into Design, Execute and Evaluate phases, to assist a user in developing a Factorial Design experiment in which one, two or three factors are tested simultaneously. In a preferred embodiment, a database infrastructure and web client, browser-based methodology functions as the expert system (a “wizard”) to design experiments, build control groups and evaluate results, all with the goal of discovering what values for which factors will yield the optimum response from subjects.

RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 11/517,174, entitled “Factorial Design Expert System,” filed Sep. 7, 2006, which is related to U.S. patent application Ser. No. 11/517,180, entitled “Predicting Response Rate,” filed Sep. 7, 2006, and U.S. patent application Ser. No. 11/517,175, entitled “Online Direct Marketing System,” filed on Sep. 7, 2006. The entire teachings of the above applications are incorporated herein by reference.

BACKGROUND OF THE INVENTION

The present invention is generally related to statistics, marketing, and experimental design and more particularly related to an expert system that uses split-run and factorial design methods to determine which factors are most important in an experiment.

Marketing is a process through which a company induces new and existing customers to buy its products and services. One familiar type of marketing activity is advertising, where a company broadcasts its message to whoever is viewing the medium carrying the advertising message, for example newspapers, television, billboards, web sites, even the sides of buses. Another type of marketing activity is direct marketing, in which a company tries to address its prospects and customers individually through postal mail or email.

Targeting is the process of selecting potential buyers, perhaps for particular products or their likelihood of making a purchase in the near future or because they may be in danger of defecting, among other reasons. Properly done, targeting should also include predicting the results of the actual campaign. Testing is the process of experimenting to determine the most effective offers or the right customers to target. Campaigning involves contacting the targeted customers by appropriate media, such as email or direct mail.

Like other researchers, marketers need to carefully design experiments to effectively test marketing campaigns. For example, they may need to determine whether the optimum discount is 5%, 10%, or a $10 off coupon. They typically need to determine what level of personalization is most effective, and what communications channels work best. Each of these variables is called a Factor. To get the answers to questions like these, marketers design small campaigns to test what value of each factor works best.

Historically the process has been to use a main population and a control group to measure the effect of a factor. The main population gets a campaign with value1 for factor 1. Factor 1 might be discount coupon rate. Value1 might be $10 off on $50 of purchases. The control group consists of a population with the same characteristics as the main group, but with a different value, value2, for the factor, for example $20 off on $100 of purchases. This simple kind of design is called A/B, Split-run, or more commonly One Factor At a Time (OFAT) design. Only one factor is changed. When OFAT design is used, several campaigns are needed to test multiple factors.

Advances in statistical analysis have led to a much improved methodology called factorial design in which several factors can be tested in the same campaign. Adoption of factorial design has been slow because tests can be difficult to design and hard to interpret, especially when the number of factors grows or partial factorial designs are used. However, factorial design experiments have several advantages:

-   Results are obtained sooner because multiple campaigns are not needed
-   Costs are lower because smaller subject populations are used and fewer campaigns are launched
-   Interactions between factors can be measured, which is close to impossible with split-run designs

SUMMARY OF THE INVENTION

For many years, the only acceptable design for an experiment was split-run, or One Factor At a Time (OFAT). More recently, marketing researchers have recognized the validity of Factorial Design, in which multiple factors can be tested simultaneously.

Testing several factors together is faster, less expensive, and reveals the interactions among the factors. However, factorial design is conceptually harder to understand for experimenters not well versed in statistics, and correspondingly harder to interpret, for example, by a typical small businessperson.

To overcome these obstacles, the invention describes an expert system that is architected into Design, Execute and Evaluate phases, to assist a user in developing a Factorial Design experiment in which one, two or three factors are tested simultaneously. In a preferred embodiment, a database infrastructure and web client, browser-based methodology functions as the expert system (a “wizard”) to design experiments, build control groups and evaluate results, all with the goal of discovering what values for which factors will yield the optimum response from the subjects.

More particularly, in the Design phase of the system, the user is asked a series of questions to determine what is to be tested, how many factors are involved, desired size of the test population, and information about any groups of customers that should be included or excluded from testing. Based on this information, the system then creates subgroups and assigns specific values (“treatments”) for the factors in each subgroup. The test population is recorded in a database.

In the Execution phase of the system, which may be implemented in an Online Direct Marketing System, each member of a treatment subgroup receives email or direct mail with the appropriate treatment, typically an offer to buy some product or service. After a suitable period of time, transaction data is returned to the system for evaluation.

In the Measurement and Evaluation phase, the transactions from each subgroup are analyzed according to the formulae in the invention. Main effects and interaction effects are calculated. Results are presented to the experimenter in the form of a table and bar chart so the experimenter can determine which values for the various factors yield the best responses.

The user can extend the expert system using more factors in a straightforward way.

In one preferred embodiment, the expert system is part of an Online Direct Marketing System delivered over the Internet through a web client (browser). However, the invention will work equally well running on a local system and dealing with experiments far afield from marketing campaigns.

The objective of this expert system is to enable relatively unsophisticated experimenters to design and execute effective experiments just by answering questions posed by the system.

While the preferred embodiment is for use in marketing experiments that include testing and targeting of customers or campaign offers, the concepts disclosed herein permit extension of the system to many other fields and types of experiments.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing will be apparent from the following more particular description of example embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the present invention.

FIG. 1 is a high level diagram of a system environment in which the factorial design expert system may be implemented.

FIG. 2 is a sequence of steps performed in a Design phase.

FIG. 3 is a sequence of steps performed in a Data Collection (Execution) phase.

FIG. 4 is a sequence of steps performed in an Evaluation phase.

DETAILED DESCRIPTION OF THE INVENTION

A description of preferred embodiments of the invention follows. The following definitions are used in this document:

Definitions

-   Factor: a variable to test; examples include but are not limited to collateral size, discount, extent of personalization, communications channel, etc.
-   Level: the number of different values for a factor; typically there are two (5% off, 10% off) or three different levels; in a two level design, the higher value is typically denoted by a plus sign (+) and the lower value is denoted by a minus sign (−) when the values are numerical; the values are not necessarily numerical
-   Treatment: the level value delivered to a subject; for example if the factor is discount level and the level values are 5% and 10%, then those subjects offered the 10% discount are said to have been given the + treatment
-   Control group: a subset of the subject population that is set aside for a different treatment to determine the effect of a factor; the control group can also be one of the combinations of the different treatments
-   Recipe: the collective set of treatments given to a subject; for example in a two factor, two level design, one recipe is to give a subject the + treatment for both factors
-   Response: what the subject does after receiving the treatment; for example the subject makes a purchase
-   Main effect: the response result due to a particular factor
-   Interaction effects: response results due to a combination of factors; how one factor influences another

Process

There are three phases to an implementation of an expert Factorial Design system according to the invention. These encompass the high level process or steps (phases) of creating, conducting, and analyzing tests, namely:

I) Creating Tests, in which the test is created and executed;

II) Data Collection, in which the results of the test are gathered and counted; and

III) Evaluation, in which the collected results are analyzed and displayed.

Each of these high level phases will now be described in detail.

Phase I. Creating Tests

Referring to the Environment Diagram, FIG. 1, in this first phase of the process, Marketers 100 upload customer transaction data 125 via the Internet 110 to a Data Store 120. Then Marketers 100 interact via the Internet 110 with the Expert System 130 to create a test 140. The Expert System 130 uses the transaction data 125 to collect a population of customers 160 from which to extract a subset for the test. Next, the Marketer 100 interacts via a web browser with the Expert System 130 to set the parameters and filters for test groups.

FIG. 2 shows some of the steps performed in creating tests 140 in more detail. In step 202, a test population is selected. The test population is defined using parameters such as size and characteristics. Undesirable attributes of the population may also be determined, and such members suppressed, in step 204. Next, in step 206, the desired type of test 140 is defined by the Marketer 100. This definition may include types of offers, or types of targeting.

As part of defining the test, specific Factors to include in a test 140 are also determined in step 208. There may be a range of values (low and high) specified for each Factor in step 210; there may be up to three Factors for each test 140. Specific examples of Factors as used in the definition of a test 140 are described in more detail below in the section on the design of Data Store 120.

Finally, in step 212 the test parameters are stored. Data that specifies the nature of the test 140 is collected from the user, e.g., gathered through a series of questions presented in a web-client (browser) application. Answers to the questions are stored in the appropriate database tables, described below.

As described in more detail below, the test 140 creation process automatically creates as many subgroups as necessary to deploy a full factorial experimental design for the number of factors to be tested.

For example, if L is the number of levels and F is the number of factors, the number of subgroups needed is L^F, or L raised to the F power. When L=2 and F=3, then 2^3=8 subgroups are needed. Subjects (Customers 160 from the test population) are randomly assigned to the subgroups by the Expert System 130.
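The subgroup count and random assignment described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `full_factorial_recipes` and `assign_subjects`, the round-robin dealing after a shuffle, and the 800 customer IDs are all assumptions made for the example. The order in which recipes are enumerated here is arbitrary and need not match the recipe numbering used in the tables later in this document.

```python
import itertools
import random


def full_factorial_recipes(num_factors, levels=("+", "-")):
    """Return every combination of levels across the factors (L^F recipes)."""
    return list(itertools.product(levels, repeat=num_factors))


def assign_subjects(customer_ids, recipes, seed=None):
    """Shuffle the subjects, then deal them round-robin into the subgroups.

    Hypothetical helper: the source only says assignment is random.
    """
    rng = random.Random(seed)
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)
    return {cid: recipes[i % len(recipes)] for i, cid in enumerate(shuffled)}


# Two levels, three factors: 2^3 = 8 subgroups.
recipes = full_factorial_recipes(num_factors=3)
# Hypothetical test population of 800 customer IDs.
assignment = assign_subjects(range(1, 801), recipes, seed=42)
```

Shuffling before dealing keeps the subgroups equal in size while still making each customer's recipe random; a different scheme (for example, independent random draws per customer) would also satisfy the description.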

The Marketer 100 downloads these subgroups and carries out the test campaigns in which each subgroup gets a communication that implements the recipe for that subgroup. Alternatively, the Expert System 130 itself could send the emails or printed materials via email and/or print engines 150 to customers 160.

Phase II. Data Collection

After a suitable period of time, the Marketer 100 again collects transaction data that details which of the test campaign recipients responded in what ways, and again uploads the transaction data to the expert system. The Marketer 100 then interacts once more with the Expert System for the Data Collection phase. FIG. 3 shows the steps for this phase of the process.

For this part of the Data Collection process, the Marketer 100 identifies the campaign, in step 302, and test 140, in step 304, so the Expert System 130 knows which customers 160 are in the test population. Then the Expert System 130 collects the transaction data in step 306 from those customers for the time period defined by the test. Then, in step 308 the Expert System 130 calculates the response rates for the various recipes, which are then displayed in step 310.
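One way the response-rate calculation of step 308 might look is sketched below. The source does not define the response rate precisely, so this example assumes it is the fraction of a subgroup's members who placed at least one order during the test period; the function name `response_rates` and the `(recipe_id, num_orders)` row layout are likewise assumptions.

```python
from collections import defaultdict


def response_rates(test_customers):
    """Compute the per-recipe response rate.

    test_customers: iterable of (recipe_id, num_orders) rows, one per subject.
    Assumption: a subject "responded" if num_orders > 0.
    """
    sizes = defaultdict(int)
    responders = defaultdict(int)
    for recipe_id, num_orders in test_customers:
        sizes[recipe_id] += 1
        if num_orders > 0:
            responders[recipe_id] += 1
    return {recipe_id: responders[recipe_id] / sizes[recipe_id] for recipe_id in sizes}


# Hypothetical rows: recipe 1 has 2 subjects, recipe 2 has 4.
rows = [(1, 0), (1, 2), (2, 0), (2, 0), (2, 1), (2, 1)]
rates = response_rates(rows)
```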

Phase III. Evaluation

Now the Expert System 130 is ready to proceed to the Evaluation Phase, where the effects of each factor are calculated as well as the effects due to interactions between the factors. FIG. 4 shows the steps in this phase of the process.

From the response data, the main effect and the interaction effects, if any, are calculated in steps 402 and 404 according to the formulae described below. The main and interaction effects are presented in a table and/or as bar charts in step 406, such as in a Report section of the Expert System 130, organized according to whether the factor helps or hurts the response. After the test results are displayed, the Marketer can decide which factors are the ones to use in the full campaign, and at what levels, via an interaction in step 408.

Expert System Elements

Specific elements of the Expert System 130 are now described in more detail, including the format of Data Store 120, and how the Main Effects and Interaction Effects are determined in the Test Evaluation Phase.

Database Design for Data Store 120

The infrastructure to enable this three phase process is a database and associated SQL code. In addition to the tables used to store transactions 124, products, and customer 160 information, four more tables are used to store test parameters: Tests 121, Factors 122, Recipes 123, and Test Customers 124. The fields for these tables are shown below.

Test table 121

-   Test ID
-   Test Name
-   Number of Factors
-   Number of Levels
-   Number of subgroups

Factors table 122

-   Factor ID
-   Factor name
-   Test ID
-   Value 1
-   Value 2
-   Value 3, with one value for each level, L=2 or 3

Recipes table 123

-   RecipeID (key)
-   TestID
-   Factor1 (value would be a FactorID)
-   Factor2
-   Factor3 (used for a three factor test)
-   F1treatment (value is a Value1 or Value2 or Value 3 from the Factor table; thus a recipe record states what Values are associated with each Factor used, and Treatment is the term used to denote that Factor/Value combination.)
-   F2treatment
-   F3treatment

Test_Customers table 124

-   Customer ID
-   TestID (specifies the test)
-   RecipeID (specifies the subgroup)
-   Revenue during test period
-   NumOrders

The Test_Customers table 124 associates the treatments with the subjects (customers 160). The Test_Customers table 124 holds the customers 160 in the test, specifying in which subgroup they have been placed. This table 124 also holds the results (revenue, response) from the Test. This table 124 can be large.

For a one factor, split-run test, there is one record in the Factors table 122 for the single factor. For a two factor test, there are two records in the Factors table 122 for a given TestID; for a three factor test there are three records for the same TestID. Each factorID has as many treatment values (value1, value 2, . . . ) as there are levels in the test. Thus each test 140 is specified by the number of factors, the number of levels, and a set of recipes.
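The four tables listed above could be laid out roughly as in the following sketch, which uses Python's built-in `sqlite3` module purely for illustration. The source names the fields but not the column types, keys, or database engine, so the snake_case column names, the types, the foreign keys, the composite primary key on `test_customers`, and the sample rows are all assumptions.

```python
import sqlite3

# Assumed schema: field names follow the lists above; types and keys are guesses.
SCHEMA = """
CREATE TABLE tests (
    test_id        INTEGER PRIMARY KEY,
    test_name      TEXT NOT NULL,
    num_factors    INTEGER NOT NULL,
    num_levels     INTEGER NOT NULL,
    num_subgroups  INTEGER NOT NULL
);
CREATE TABLE factors (
    factor_id   INTEGER PRIMARY KEY,
    factor_name TEXT NOT NULL,
    test_id     INTEGER NOT NULL REFERENCES tests(test_id),
    value1      TEXT,
    value2      TEXT,
    value3      TEXT
);
CREATE TABLE recipes (
    recipe_id    INTEGER PRIMARY KEY,
    test_id      INTEGER NOT NULL REFERENCES tests(test_id),
    factor1      INTEGER REFERENCES factors(factor_id),
    factor2      INTEGER REFERENCES factors(factor_id),
    factor3      INTEGER REFERENCES factors(factor_id),
    f1_treatment TEXT,
    f2_treatment TEXT,
    f3_treatment TEXT
);
CREATE TABLE test_customers (
    customer_id INTEGER NOT NULL,
    test_id     INTEGER NOT NULL REFERENCES tests(test_id),
    recipe_id   INTEGER NOT NULL REFERENCES recipes(recipe_id),
    revenue     REAL DEFAULT 0,
    num_orders  INTEGER DEFAULT 0,
    PRIMARY KEY (customer_id, test_id)
);
"""


def create_data_store(path=":memory:"):
    """Create the four test tables in a new SQLite database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_data_store()
# Hypothetical two factor, two level test: one tests row, two factors rows.
conn.execute("INSERT INTO tests VALUES (1, 'Discount x Channel', 2, 2, 4)")
conn.executemany(
    "INSERT INTO factors VALUES (?, ?, 1, ?, ?, NULL)",
    [(1, "discount", "5%", "10%"), (2, "channel", "email", "direct mail")],
)
```

As the paragraph above states, a two factor test ends up with two records in the factors table for the same test ID; the third value column stays empty in a two level design.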

For tests 140 with one, two, or three factors, the recipes table 123 for a given testID has two, four or eight entries, respectively. Each entry describes the treatment for each factor. Recipe 1 would say Factor A is +, Factor B is +. Recipe 2 would say Factor A is +, Factor B is − (all in a 2 factor design). Again for L=2, the Factors table 122 will specify the two values of the factor. For example, if the Factor is discount level, value 1 might be 5% and value 2 could be 10%. See the chart below, where + and − represent the two levels for a given factor.

  Recipe   Factor 1   Factor 2
  -------- ---------- ----------
  R1       +          +
  R2       +          −
  R3       −          +
  R4       −          −

Specifically, the Recipes table 123 identifies the

-   recipe number for a given test
-   factors (e.g. discount level, personalization) using Factor ID
-   factor treatment for each factor (value1, value2 or value3)

For a two level, three factor design, there are eight recipes as shown in the next table.

  Recipe   Factor 1   Factor 2   Factor 3
  -------- ---------- ---------- ----------
  R1       −          −          −
  R2       +          −          −
  R3       −          +          −
  R4       +          +          −
  R5       −          −          +
  R6       +          −          +
  R7       −          +          +
  R8       +          +          +

Calculating Main and Interaction Effects

A. Two Factor Design

When the test is completed, these tables (121, 122, 123, 124) are queried to produce the test results. The object is to determine which factors have what effects on the responses. Two kinds of effects are calculated:

-   Main effects, which analyze the effects of each factor individually
-   Interaction effects, which analyze the factors acting together

Using the two factor, two level design above, values are assigned to the responses y(n) in the various cells as follows:

              Factor 1+         Factor 1−
  ----------- ----------------- -----------------
  Factor 2+   y(1) (R1: + +)    y(3) (R3: − +)
  Factor 2−   y(2) (R2: + −)    y(4) (R4: − −)

Then the Main Effect of Factor 1 is determined by comparing all the Factor 1+ responses with the Factor 1− responses. That is, we calculate

ME(F1)=(y(1)+y(2)−y(3)−y(4))/2

Similarly,

ME(F2)=(y(1)+y(3)−y(2)−y(4))/2

The Interaction Effect between F1 and F2 is

IE(F1×F2)=(y(1)+y(4)−y(2)−y(3))/2
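The three two-factor formulas above can be checked with a short sketch. The function name `two_factor_effects` and the sample responses (hypothetical response rates of 8, 5, 4 and 3 percent for recipes R1 through R4) are assumptions chosen only to make the arithmetic easy to follow.

```python
def two_factor_effects(y1, y2, y3, y4):
    """Effects for a two factor, two level design.

    y1..y4 are the responses for recipes R1 (+ +), R2 (+ -), R3 (- +), R4 (- -),
    following the cell layout given in the text.
    """
    return {
        "ME(F1)": (y1 + y2 - y3 - y4) / 2,
        "ME(F2)": (y1 + y3 - y2 - y4) / 2,
        "IE(F1xF2)": (y1 + y4 - y2 - y3) / 2,
    }


# Hypothetical response rates, in percent, for R1..R4.
effects = two_factor_effects(y1=8, y2=5, y3=4, y4=3)
# ME(F1) = (8 + 5 - 4 - 3) / 2 = 3.0
# ME(F2) = (8 + 4 - 5 - 3) / 2 = 2.0
# IE(F1xF2) = (8 + 3 - 5 - 4) / 2 = 1.0
```

With these made-up numbers, moving Factor 1 from − to + raises the response by 3 points on average, Factor 2 by 2 points, and the positive interaction indicates the two factors reinforce each other.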

B. Three Factor Design

The best way to understand a three factor design is through a cube plot, where the eight responses (typically revenue) to the eight combinations (Recipes) of the three factors are plotted on the eight vertices of a cube. Calling the revenue response the yield (y), we represent the eight yields as y(1), y(2), . . . , y(8), corresponding to the eight recipes. Then

ME(F1)=(y(2)+y(4)+y(6)+y(8)−y(1)−y(3)−y(5)−y(7))/4

ME(F2)=(y(3)+y(4)+y(7)+y(8)−y(1)−y(2)−y(5)−y(6))/4

ME(F3)=(y(5)+y(6)+y(7)+y(8)−y(1)−y(2)−y(3)−y(4))/4

There can be two factor interactions and three factor interactions. The two factor interactions are

IE(F1×F2)=(y(1)+y(4)+y(5)+y(8)−y(2)−y(3)−y(6)−y(7))/4

IE(F1×F3)=(y(1)+y(3)+y(6)+y(8)−y(2)−y(4)−y(5)−y(7))/4

IE(F2×F3)=(y(1)+y(2)+y(7)+y(8)−y(3)−y(4)−y(5)−y(6))/4

The three factor interactions are more complex. Consider the F1×F2 interaction. We can examine this interaction at the + level for Factor 3 and at the − level for Factor 3. The interaction at the + level for Factor 3 is

(y(8)−y(7)−(y(6)−y(5)))/2

At the − level, it is

(y(4)−y(3)−(y(2)−y(1)))/2

The consistency of the F1×F2 interaction across variations in F3 is measured by the difference between these two terms. Half of this difference is defined as the three factor interaction between F1, F2, and F3.
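The three factor formulas can be expressed compactly with a table of signs, as in the following sketch. The names `SIGNS`, `effect` and `three_factor_interaction` and the eight sample yields are assumptions for illustration. The sign table follows the recipe order R1 through R8 of the three factor table above, and each effect is the sum of the yields weighted by the product of the relevant signs, divided by 4, which reproduces the ME and IE formulas term by term.

```python
from math import prod

# (F1, F2, F3) signs for recipes R1..R8, in the order of the three factor table.
SIGNS = [
    (-1, -1, -1), (+1, -1, -1), (-1, +1, -1), (+1, +1, -1),
    (-1, -1, +1), (+1, -1, +1), (-1, +1, +1), (+1, +1, +1),
]


def effect(y, factors):
    """Contrast for the given 0-based factor indices; y holds y(1)..y(8)."""
    return sum(prod(s[f] for f in factors) * yi for s, yi in zip(SIGNS, y)) / 4


def three_factor_interaction(y):
    """Half the difference of the F1xF2 interaction at F3+ and at F3-."""
    y1, y2, y3, y4, y5, y6, y7, y8 = y
    at_plus = ((y8 - y7) - (y6 - y5)) / 2
    at_minus = ((y4 - y3) - (y2 - y1)) / 2
    return (at_plus - at_minus) / 2


# Hypothetical yields y(1)..y(8).
y = [10, 14, 12, 20, 11, 15, 13, 29]
main_f1 = effect(y, (0,))        # ME(F1)
inter_f2_f3 = effect(y, (1, 2))  # IE(F2xF3)
three_way = three_factor_interaction(y)
```

For these made-up yields the step-by-step definition in the text and the sign-product contrast `effect(y, (0, 1, 2))` give the same three factor interaction, which is a useful consistency check on both.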

All of the factorial effects are a contrast between two averages.

While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

What is claimed is:
1. A method for design of a test, the method comprising the steps of: designing the test, using an interactive test design process executed in an integrated data processing environment, the test including at least a specification of test type, at least one test group of consumers against which the test is to be run, and multiple factors to be tested simultaneously as independent design variables; executing the test within an online expert system to determine test results, the test results including responses by the at least one test group of consumers to the multiple factors; and evaluating the test results by determining at least one of a main effect or interaction effect of the multiple factors.